[GPU] Recognize parameters as valid inputs for compressed weights #32276
base: master
Conversation
mklimenk left a comment
The two branches of the `if (pattern_map.count(weights_const_m))` condition share a lot of similarities; please consider refactoring them to avoid code duplication.
src/plugins/intel_gpu/src/plugin/transformations/convert_fc_to_compressed.cpp
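As a minimal sketch of the suggested deduplication (not the actual patch code; `weights_param_m` and `process_weights` are hypothetical names for this illustration), the shared work could be hoisted into a lambda so the if/else only selects which matched node supplies the weights:

```cpp
// Hypothetical sketch: the logic previously duplicated in both branches moves
// into one lambda; the condition merely picks the matched weights node.
const auto& pattern_map = m.get_pattern_value_map();

auto process_weights = [&](const ov::Output<ov::Node>& weights) {
    // ...handling previously repeated in both branches...
};

if (pattern_map.count(weights_const_m)) {
    process_weights(pattern_map.at(weights_const_m));
} else {
    process_weights(pattern_map.at(weights_param_m));  // hypothetical parameter pattern node
}
```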
This change enables quantized LoRA weights, passed as parameters at execution time, to be recognized by the transformations that produce `FullyConnectedCompressed` nodes for QGEMM execution.
The test previously expected the transformation to fail because `input2` was used as a weight. The new logic allows parameters to be used as weights, so the test has been adjusted to expect a successful transformation.
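A minimal sketch, assuming OpenVINO's standard pattern-matching API, of how the weights input could be relaxed to accept either a Constant or a Parameter (variable names are illustrative; the actual diff may differ):

```cpp
#include "openvino/op/constant.hpp"
#include "openvino/op/parameter.hpp"
#include "openvino/pass/pattern/op/or.hpp"
#include "openvino/pass/pattern/op/wrap_type.hpp"

// Before: only a Constant could match the compressed-weights input.
auto weights_const_m = ov::pass::pattern::wrap_type<ov::op::v0::Constant>();
// After: a Parameter (e.g. quantized LoRA weights supplied at runtime) matches too.
auto weights_param_m = ov::pass::pattern::wrap_type<ov::op::v0::Parameter>();
auto weights_m = std::make_shared<ov::pass::pattern::op::Or>(
    ov::OutputVector{weights_const_m, weights_param_m});
```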
mklimenk left a comment
Looks much cleaner now, thanks!
@CuriousPanCake please review.
Details:
Description of the issue:
At present, the `FC_COMPRESSED_WEIGHT_PATTERN` macro contains a pattern for dequantization of a constant integer weight. This pattern is used to recognize cases where fused weight dequantization can be applied, replacing them with `FullyConnectedCompressed` nodes. Because it expects a constant weight input, the pattern fails to recognize quantized LoRA weights, which are provided as parameters.
With the changes in this patch, these weights are recognized, and the transformations can proceed and produce nodes that then leverage oneDNN fused QGEMM for execution.
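For illustration only, a minimal sketch (not code from the patch) of the kind of weight-dequantization subgraph involved, with the compressed weights arriving as a `Parameter` rather than a `Constant`; shapes and quantization constants below are made up:

```cpp
#include "openvino/op/constant.hpp"
#include "openvino/op/convert.hpp"
#include "openvino/op/matmul.hpp"
#include "openvino/op/multiply.hpp"
#include "openvino/op/parameter.hpp"
#include "openvino/op/subtract.hpp"

// Activations and compressed (u8) weights both enter as Parameters; the weights
// are dequantized inline: Convert -> Subtract zero-point -> Multiply scale.
auto activations = std::make_shared<ov::op::v0::Parameter>(ov::element::f16, ov::Shape{1, 16});
auto weights     = std::make_shared<ov::op::v0::Parameter>(ov::element::u8,  ov::Shape{32, 16});
auto convert     = std::make_shared<ov::op::v0::Convert>(weights, ov::element::f16);
auto zero_point  = ov::op::v0::Constant::create(ov::element::f16, ov::Shape{32, 1}, {8.0f});
auto scale       = ov::op::v0::Constant::create(ov::element::f16, ov::Shape{32, 1}, {0.1f});
auto dequantized = std::make_shared<ov::op::v1::Multiply>(
    std::make_shared<ov::op::v1::Subtract>(convert, zero_point), scale);
// The dequantized weights feed a MatMul; after the transformation this whole
// chain would be folded into a single FullyConnectedCompressed node.
auto matmul = std::make_shared<ov::op::v0::MatMul>(activations, dequantized, false, true);
```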
Tickets: